Multi-level Schema Extraction for Heterogenous Semi-structured Data

Authors

  • Jong P. Yoon
  • Vijay V. Raghavan

Similar resources

Automatic integration of Heterogenous XML-schemas

Due to XML’s flexibility and semi-structured nature, complications arise when trying to transplant data from one XML document to another. Researchers have made great strides in solving the problem of integrating homogeneous XML, but very few specifically address the problem of integrating heterogeneous documents. We introduce XSD Matcher, a system for automatically mapping a collection of...

Semi-Structured Data Extraction and Schema Knowledge Mining

It is well known that the World Wide Web has become a huge information resource. Therefore, it is very important for us to utilize this kind of information effectively. This paper proposes a semi-structured data extraction method to get the useful information embedded in a group of relevant web pages and store it with OEM (Object Exchange Model). Then, we adopt a data mining method to discover schema...

An ontology-based approach for resolving semantic schema conflicts in the extraction and integration of query-based information from heterogeneous web data sources

There are many external resources and heterogeneous data on the internet that an organization or user may need to improve the decision-making process. It is therefore very important that this information is complete and precise and can be acquired on time. Most web sources provide data in semi-structured form on the internet. The combination of semi-structured data from different so...

Ontology Driven Web Extraction from Semi-structured and Unstructured Data for B2B Market Analysis

The Market Blended Insight project has the objective of improving UK business-to-business marketing performance using semantic web technologies. In this project, we are implementing an ontology-driven web extraction and translation framework to supplement our backend triple store of UK companies, people and geographical information. It deals with both the semi-structured data and the un...

Modelling the Webspace of an Intranet

Searching the internet using the currently available search engines is not satisfactory. The techniques used there focus on extracting relevant information directly from the documents available on the web. We introduce a new approach, which aims at describing the content of a webspace, formed by a collection of related documents, instead of looking at single documents. By identifying...

Publication date: 2000